Practical Algorithms for Pattern Based Linear Regression

Authors

  • Hideo Bannai
  • Kohei Hatano
  • Shunsuke Inenaga
  • Masayuki Takeda
Abstract

We consider the problem of discovering the optimal pattern from a set of strings and their associated numeric attribute values. The goodness of a pattern is measured by the correlation between the number of occurrences of the pattern in each string and the numeric attribute value assigned to that string. We present two algorithms based on suffix trees that find the optimal substring pattern in O(Nn) and O(N) time, respectively, where n is the number of strings and N is their total length. We further present a general branch and bound strategy that can be used when considering more complex pattern classes. We also show that combining the O(N) algorithm with the branch and bound heuristic increases the efficiency of the algorithm considerably.
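To make the scoring criterion concrete, here is a minimal brute-force sketch. It is not the suffix tree algorithms of the paper: it simply enumerates every substring and scores it, which is far slower than the O(Nn) and O(N) bounds stated above. The function names (`count_occurrences`, `pearson`, `best_substring`) are hypothetical, and two choices are assumptions not taken from the abstract: occurrences are counted with overlaps, and the score is the squared Pearson correlation.

```python
from math import sqrt


def count_occurrences(pattern, text):
    """Count occurrences of pattern in text, overlaps included (an assumption)."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def best_substring(strings, values):
    """Brute-force baseline: score every distinct substring, keep the best.

    The score is the squared correlation between per-string occurrence
    counts and the numeric values (assumed scoring, for illustration only).
    """
    candidates = {
        s[i:j]
        for s in strings
        for i in range(len(s))
        for j in range(i + 1, len(s) + 1)
    }
    best, best_score = None, -1.0
    for pattern in sorted(candidates):  # sorted for deterministic tie-breaking
        counts = [count_occurrences(pattern, s) for s in strings]
        score = pearson(counts, values) ** 2
        if score > best_score:
            best, best_score = pattern, score
    return best, best_score


if __name__ == "__main__":
    strings = ["aa", "a", "b", ""]
    values = [2.0, 1.0, 0.0, 0.0]
    print(best_substring(strings, values))
```

In this toy input the occurrence counts of "a" are 2, 1, 0, 0, which match the attribute values exactly, so "a" is the pattern with the highest score. A suffix tree avoids rescoring substrings that share the same occurrence profile, which is where the speedups described in the abstract would come from.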


Related resources

Modelling Climatic Parameters Affecting the Annual Yield of Rheum Ribes Rangeland Species using Data Mining Algorithms

Identifying the climatic characteristics that affect the annual yield of Rheum ribes can be useful in the management and development of this species in rangelands. In this research, the annual yield of this species in Khorasan-Razavi province was evaluated against 74 climatic parameters over a ten-year period, and the influential climatic parameters were extracted using data mining methods. First, the role ...


A Combinatorial Algorithm for Fuzzy Parameter Estimation with Application to Uncertain Measurements

This paper presents a new method for regression model prediction in an uncertain environment. In practical engineering problems, when a regression or ANN model is developed for making predictions, the average of a set of repeated observed values is introduced to the model as an input variable. Therefore, the estimated response of the process is also the average of a set of output values where th...


Modeling and forecasting US presidential election using learning algorithms

The primary objective of this research is to obtain an accurate forecasting model for the US presidential election. To identify a reliable model, artificial neural networks (ANN) and support vector regression (SVR) models are compared based on some specified performance measures. Moreover, six independent variables such as GDP, unemployment rate, the president’s approval rate, and others are co...


A Thinning Method of Linear And Planar Array Antennas To Reduce SLL of Radiation Pattern By GWO And ICA Algorithms

In recent years, optimization techniques using evolutionary algorithms have been widely used to solve electromagnetic problems. These algorithms thin the antenna arrays with the aim of reducing complexity, thereby achieving the optimal solution and decreasing the side lobe level. To obtain the optimal solution, thinning is performed by removing some elements in an array thro...


Machine learning algorithms in air quality modeling

Modern studies in the field of environmental science and engineering show that deterministic models struggle to capture the relationship between the concentration of atmospheric pollutants and their emission sources. Recent advances in statistical modeling based on machine learning approaches have emerged as a solution to tackle these issues. It is a fact that input variable type largely affec...



Publication date: 2005